Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset
نویسندگان
چکیده
MiRNAs are short non-coding RNAs of about 22 nucleotides, which play critical roles in gene expression regulation. The biogenesis of miRNAs is largely determined by the sequence and structural features of their parental RNA molecules. Based on these features, multiple computational tools have been developed to predict if RNA transcripts contain miRNAs or not. Although being very successful, these predictors started to face multiple challenges in recent years. Many predictors were optimized using datasets of hundreds of miRNA samples. The sizes of these datasets are much smaller than the number of known miRNAs. Consequently, the prediction accuracy of these predictors in large dataset becomes unknown and needs to be re-tested. In addition, many predictors were optimized for either high sensitivity or high specificity. These optimization strategies may bring in serious limitations in applications. Moreover, to meet continuously raised expectations on these computational tools, improving the prediction accuracy becomes extremely important. In this study, a meta-predictor mirMeta was developed by integrating a set of non-linear transformations with meta-strategy. More specifically, the outputs of five individual predictors were first preprocessed using non-linear transformations, and then fed into an artificial neural network to make the meta-prediction. The prediction accuracy of meta-predictor was validated using both multi-fold cross-validation and independent dataset. The final accuracy of meta-predictor in newly-designed large dataset is improved by 7% to 93%. The meta-predictor is also proved to be less dependent on datasets, as well as has refined balance between sensitivity and specificity. This study has two folds of importance: First, it shows that the combination of non-linear transformations and artificial neural networks improves the prediction accuracy of individual predictors. Second, a new miRNA predictor with significantly improved prediction accuracy is developed for the community for identifying novel miRNAs and the complete set of miRNAs. Source code is available at: https://github.com/xueLab/mirMeta.
منابع مشابه
Identification of miR-24 and miR-137 as novel candidate multiple sclerosis miRNA biomarkers using multi-staged data analysis protocol
Many studies have investigated misregulation of miRNAs relevant to multiple sclerosis (MS) pathogenesis. Abnormal miRNAs can be used both as candidate biomarker for MS diagnosis and understanding the disease miRNA-mRNA regulatory network. In this comprehensive study, misregulated miRNAs related to MS were collected from existing literature, databases and via in silico prediction. A multi-staged...
متن کاملPREDICTION OF EARTHQUAKE INDUCED DISPLACEMENTS OF SLOPES USING HYBRID SUPPORT VECTOR REGRESSION WITH PARTICLE SWARM OPTIMIZATION
Displacements induced by earthquake can be very large and result in severe damage to earth and earth supported structures including embankment dams, road embankments, excavations and retaining walls. It is important, therefore, to be able to predict such displacements. In this paper, a new approach to prediction of earthquake induced displacements of slopes (EIDS) using hybrid support vector re...
متن کاملSSCMDA: spy and super cluster strategy for MiRNA-disease association prediction
In the biological field, the identification of the associations between microRNAs (miRNAs) and diseases has been paid increasing attention as an extremely meaningful study for the clinical medicine. However, it is expensive and time-consuming to confirm miRNA-disease associations by experimental methods. Therefore, in recent years, several effective computational models for predicting the poten...
متن کاملA study on the accuracy of motion tracking of thoracic tumors at radiotherapy with external surrogates
Introduction: In radiotherapy with external surrogates, exact information of tumor position is one of the key factors that improves treatment delivery. Many dynamic tumors in thorax region of patient move mainly due to respiration and are known as intra-fractional motion error that must be compensated, as well. One of clinical strategy is using Stereotactic Body Radiation Thera...
متن کاملPrediction of both conserved and nonconserved microRNA targets in animals
MOTIVATION MicroRNAs (miRNAs) are involved in many diverse biological processes and they may potentially regulate the functions of thousands of genes. However, one major issue in miRNA studies is the lack of bioinformatics programs to accurately predict miRNA targets. Animal miRNAs have limited sequence complementarity to their gene targets, which makes it challenging to build target prediction...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 11 شماره
صفحات -
تاریخ انتشار 2016